Local symmetries in complex networks 
Symmetry, i.e. invariance to certain operators, is a fundamental concept in many branches of physics. We propose
ways to measure symmetric properties of vertices, and their surroundings, in networks. To be stable to the randomness
inherent in many complex networks, we consider measures that are continuous rather than dichotomous.
The main operator we suggest is permutations of the paths of a certain length leading out from a vertex. If these
paths are more similar (in some sense) than expected, the vertex is a local center of symmetry in the network. We
discuss different precise definitions based on this idea and give examples of how different symmetry coefficients can
be applied to protein interaction networks.



PACS numbers: 89.75.Fb, 89.75.Hc 



I. INTRODUCTION 

Since the turn of the century, the field of complex networks
has been one of the most active areas of statistical
physics (1; 2; 4; 9). One of the central questions is to
find quantities for measuring network structure (how a net- 
work differs from a random graph). The basic assumption 
is that the network structure is related to the function of the 
network. Thus, by measuring network structural quantities, 
one can say something both about the forces that created the 
network, and about how dynamic systems on the network 
behave. One important concept in many areas of physics
(particle physics, condensed matter physics and more (5)) is
symmetry, that is, invariance to particular operators. Our approach
is to presuppose that symmetry can be useful for studying complex
networks, and then to construct a sensible and general
framework for measuring symmetry in networks.

In Ref. (5) we define a measure for degree-symmetries in
networks, a degree-symmetry coefficient. This is a local,
vertex-specific measure, i.e. it includes only information from
a bounded surrounding of the vertex. The fundamental operator
in this definition of degree-symmetry is permutations of
paths of length l leading out from a vertex i. If the degree
sequences of paths of length l from i overlap to a great extent,
then we say i is a center of degree-symmetry. In other words,
if, regardless of which path we take out from i, we see the
same sequence of degrees, then i is highly degree-symmetric.
If one replaces degree, in this definition, by some other vertex- 




FIG. 1 An illustration of perfect degree symmetry (a) and perfect 
symmetry of external traits (b). In (a) different symbols represent 
different degrees. In (b) a symbol represents an external trait (in this 
paper we exemplify this by functional categories of proteins). (Non- 
self-intersecting) paths of length two will, in both cases, first lead 
to a triangle, then to a square, which means the circle is a center of 
symmetry. 



specific quantity, one gets a general framework for analyz- 
ing local symmetry — instead of degree symmetries, one can 
talk about clustering symmetries, betweenness symmetries or 
symmetries with respect to any other (network related or external)
vertex specific quantity. (See Fig. 1.) In this paper we
will discuss such extensions of the degree-symmetry coefficient.
As one example we study functional symmetries in networks 
of proteins. 



II. DEFINITION OF THE MEASURE 

We consider a network modeled by an unweighted and
undirected graph of N vertices, V, and M edges, E. We assume
the graph has no multiple edges or self-edges. Let X(i)
be a vertex trait or structural quantity, for example degree,
betweenness centrality or a protein function. Consider
a vertex i and the paths of length l leading out from this
vertex. These paths can be thought of as the look of the network
from the vantage point i. The cut-off length l reflects
that the influence of the network on i's function decreases
with distance. In principle one can use any decaying function
to lower the weight of distant vertices. We chose the simplest
functional form (at least the easiest to implement): a
step function weighing vertices at a distance l, or less, from
i equally (while yet more distant vertices are not considered at
all). In the numerical examples, we will choose the shortest
non-trivial value, l = 2. The sequences of X-values along
these paths are the input to the symmetry measure. We denote
such sequences:

    Q_l(i) = \{ [X(v_1^1), \ldots, X(v_l^1)], \ldots, [X(v_1^p), \ldots, X(v_l^p)] \},    (1)

where v_j^m is the j'th vertex along the m'th path of length l
leading out from i. Then let F(X, X') be a function measuring
the similarity of two X-values (for integer valued X-functions,
one example of an F-function is Kronecker's delta). A first attempt
to construct a symmetry measure is to sum F(X(j), X(j'))
for vertex pairs at the same distance from i in Q_l(i), i.e.

    \tilde{s}_l(i) = \sum_{0 < n < n' \le p} \sum_{j=1}^{l} F(X(v_j^n), X(v_j^{n'})),    (2)

to be normalized by

    \Lambda = (l - 1) \binom{p}{2}.    (3)

This measure has many statistical discrepancies. For example,
all paths that go via a particular neighbor of i contribute to the
sum. In practice this means that vertices with a high-degree
vertex j at a distance close to l will (by virtue of the many
paths that overlap up to j) trivially have a high \tilde{s}_l(i) / \Lambda. To get
around this problem we omit the shared path segments, i.e. those
at indices lower than the index at which two paths in Q_l(i)
diverge (for details, see Ref. (5)). Let S_l(i) denote
the number of such terms (a way to calculate S_l(i) is given in
Ref. (5)). Then a measure compensating for terms from paths
with the same beginnings is given by:

    \hat{s}_l(i) = \frac{\tilde{s}_l(i) - S_l(i)}{\Lambda - S_l(i)},  provided \Lambda > S_l(i).    (4)
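
The construction above can be illustrated with a minimal Python sketch. It is not the implementation of Ref. (5): the names `simple_paths` and `symmetry_fraction`, and the toy graph, are hypothetical. As a simplifying assumption, the sketch skips every term in which both paths visit the same vertex at the same step (this covers shared beginnings, and also paths that reconverge) and divides by the number of remaining terms, instead of evaluating the normalization of Eqs. (3) and (4) explicitly.

```python
from itertools import combinations


def simple_paths(adj, start, length):
    """All non-self-intersecting paths with `length` edges leaving `start`.

    Each path is returned as the tuple of vertices visited after `start`.
    """
    found = []

    def extend(path):
        if len(path) == length + 1:
            found.append(tuple(path[1:]))
            return
        for nxt in sorted(adj[path[-1]]):
            if nxt not in path:
                extend(path + [nxt])

    extend([start])
    return found


def symmetry_fraction(adj, trait, start, length, similar):
    """Fraction of similar vertex pairs at equal distance along path pairs."""
    matches = terms = 0
    for p, q in combinations(simple_paths(adj, start, length), 2):
        for u, v in zip(p, q):
            if u == v:
                continue  # same vertex on both paths: trivial term, skipped
            terms += 1
            matches += similar(trait[u], trait[v])
    return matches / terms if terms else None


# Toy graph in the spirit of Fig. 1: a circle (0) whose three neighbors are
# triangles, each followed by one square.
adj = {0: {1, 2, 3}, 1: {0, 4}, 2: {0, 5}, 3: {0, 6}, 4: {1}, 5: {2}, 6: {3}}
trait = {0: "circle", 1: "triangle", 2: "triangle", 3: "triangle",
         4: "square", 5: "square", 6: "square"}
kronecker = lambda x, y: int(x == y)

print(symmetry_fraction(adj, trait, 0, 2, kronecker))  # 1.0
```

In this toy case every path of length two from vertex 0 shows the same trait sequence, so all six compared pairs match. Returning `None` when no pair of paths exists is a design choice of the sketch, not something prescribed by the text.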

The degree sequence is often considered an inherent property
of the system. Structure should, in such cases, be defined relative
to a null-model of random graphs conditioned to the same
degree distribution as the network. A measure where zero denotes
neutrality can be constructed as:

    s_l(i) = \hat{s}_l(i) - \langle \hat{s}_l(i) \rangle,    (5)

where \langle \cdot \rangle denotes average over an ensemble of random graphs
with the same set of degrees as the original network. A way
to sample such null-model graphs is to randomly rewire the
edges of the original network (keeping the vertices' degrees
conserved at every time step). Note that, for such rewiring
procedures, there are many sample-technical considerations
needed to achieve ergodicity and statistical independence. We
use the scheme proposed in Ref. (11) and 1000 sample averages.
If the X-function only depends on the network, one
can recalculate it for each individual realization of the null- 
model. If the information behind X(i) is external, then one 
has to let the trait be associated with i throughout the random- 
ization process, or randomly distribute the traits among the 
vertices. The former situation is suitable if the trait has some 
connection to the degree, the latter (that we use in this paper) 
is more appropriate if there are no such connections. 
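
The null-model sampling described above can be sketched as follows. This is a generic edge-swap randomization, not necessarily the scheme of Ref. (11), and it does not address the ergodicity and independence issues mentioned in the text. The function names (`rewire`, `shuffle_traits`, `degrees`) and the cap on the number of attempts are assumptions made for illustration.

```python
import random
from collections import Counter


def degrees(edges):
    """Degree of every vertex in an edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization by repeated edge swaps.

    Two edges (u, v) and (x, y) are replaced by (u, y) and (x, v) unless
    that would create a self-edge or a multiple edge.
    """
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        a, b = rng.sample(range(len(edges)), 2)
        (u, v), (x, y) = edges[a], edges[b]
        if rng.random() < 0.5:
            x, y = y, x  # consider both orientations of the second edge
        if u == y or x == v:
            continue  # would create a self-edge
        e1, e2 = frozenset((u, y)), frozenset((x, v))
        if e1 in present or e2 in present:
            continue  # would create a multiple edge
        present -= {frozenset((u, v)), frozenset((x, y))}
        present |= {e1, e2}
        edges[a], edges[b] = (u, y), (x, v)
        done += 1
    return edges


def shuffle_traits(trait, rng):
    """Randomly redistribute external traits among the vertices."""
    vertices = list(trait)
    values = [trait[v] for v in vertices]
    rng.shuffle(values)
    return dict(zip(vertices, values))
```

A coefficient with zero as the neutral value, in the sense of Eq. (5), would then be the value measured on the original network minus the mean of the same quantity over many rewired (and, for external traits, trait-shuffled) samples.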

To apply the framework described above one has to specify
a function X mapping V to integer or real numbers. Furthermore,
one has to choose an F-function indicating if two vertices
are considered similar or not. In this paper we discuss binary
valued F-functions (F(X(i), X(j)) = 1 if i and j are considered
similar, F(X(i), X(j)) = 0 otherwise), but one can also think
of real valued F-functions where a high value means a high
similarity between the two arguments.

III. APPLICATIONS TO PROTEIN INTERACTION NETWORKS 

One of the most successful applications of complex network
analysis is studies of large-scale microbiological networks.
Such studies can be performed at different levels of
the cellular organization: from genetic regulation (3; 13), via
protein interactions (7; 15), to biochemical networks (14; 16).
We will use protein interaction networks as our example. In
protein interaction networks the vertices are typically an entire
proteome (i.e. all proteins in an organism). The edges
represent pairs of proteins that can bind physically to each
other. It is important to note that only a small fraction of
the protein interactions is in effect at a particular location in a
particular cell. The biological information one can hope to
get out from studying the protein interaction network is thus
rather limited. Dynamic properties of the cellular activity, i.e.
the functions of a particular cell, are beyond the reach of static
network theory. The study of the protein interaction network,
in this paper, serves more as an example of symmetry analysis
than an advance in proteomics. If symmetry has some
relation to the protein functions, just as degree is correlated with
lethality (8), one can use the symmetry coefficient for functional
classification or prediction.

The particular protein interaction data we use (from the
yeast S. cerevisiae) was taken from MIPS (10) on January 23,
2005 (the same data set as used in Ref. (6)). The network has
N = 4580 and M = 7434. MIPS also provides a functional classification
of the proteins (12). This is a hierarchical classification
where, for example, the top-level category "metabolism"
is subdivided into e.g. "amino acid metabolism," and so on.
One protein can be assigned none, one or many functional 
categories; so, to make a symmetry measure out of this infor- 
mation, let X(i) be the set of top-level functions of i, and let 

    F(X, X') = \begin{cases} 1 & \text{if } X = X' \\ 0 & \text{otherwise} \end{cases}    (6)

We choose this F-function because it is the simplest. For 
a more thorough investigation of protein interaction symme- 
tries one might consider other functions, like the real valued 
Jaccard-index. 
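
The two choices of F mentioned here can be written down as a short Python sketch. The function names and the category labels in the usage lines are illustrative assumptions; returning 1.0 for two empty sets in the Jaccard case is also an assumption, chosen so that it agrees with the equality-based function for proteins without any annotation.

```python
def same_function_set(x, y):
    """Binary similarity: 1 if the two sets of functions are identical."""
    return 1 if set(x) == set(y) else 0


def jaccard(x, y):
    """Real valued similarity: size of intersection over size of union."""
    x, y = set(x), set(y)
    union = x | y
    if not union:
        return 1.0  # assumption: two unannotated vertices count as identical
    return len(x & y) / len(union)


# Hypothetical top-level annotations of two proteins.
a = {"metabolism"}
b = {"metabolism", "energy"}
print(same_function_set(a, b))  # 0
print(jaccard(a, b))            # 0.5
```

The example shows why the strict equality criterion makes similar pairs rare: two proteins sharing one of two categories count as entirely dissimilar, whereas the Jaccard index gives them partial credit.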

Apart from the functional symmetry coefficient we will also
measure the degree-symmetry coefficient as in Ref. (5). In
this case X(i) is the degree, or number of neighbors, of i. For
highly skewed degree distributions, as protein interaction networks
are known to have (8), it is appropriate to use:

    F(k, k') = \begin{cases} 1 & \text{if there is an } i \text{ such that } a^i \le k, k' < a^{i+1} \\ 0 & \text{otherwise} \end{cases}    (7)

We use a = 2 and i = 0, 1, 2, 3, ...
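
A minimal Python sketch of this degree similarity follows, assuming Eq. (7) declares two degrees similar exactly when they fall into the same logarithmic bin [a^i, a^(i+1)). The names `degree_bin` and `similar_degree` are hypothetical.

```python
def degree_bin(k, a=2):
    """Index i such that a**i <= k < a**(i + 1), for integers k >= 1, a >= 2."""
    if k < 1 or a < 2:
        raise ValueError("expects degree k >= 1 and base a >= 2")
    i = 0
    while a ** (i + 1) <= k:
        i += 1
    return i


def similar_degree(k1, k2, a=2):
    """1 if both degrees lie in the same logarithmic bin, 0 otherwise."""
    return 1 if degree_bin(k1, a) == degree_bin(k2, a) else 0


print(similar_degree(4, 7))  # 1, both degrees lie in [4, 8)
print(similar_degree(3, 4))  # 0, 3 lies in [2, 4) and 4 in [4, 8)
```

With a = 2 the bins are [1, 2), [2, 4), [4, 8) and so on, so the tolerance for calling two degrees similar grows with the degree, which suits skewed degree distributions.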

In Fig. 2(a) we give an example of a protein with high
degree symmetry, YKR010c. Since its neighbors are all
equal (i.e. all pairs of neighbors (j, j') have F(k_j, k_{j'}) = 1)
this is not surprising. Even many second-neighbors are
equivalent in this respect (such as YLR377c, YNL113w and
YNL099c). Fig. 2(b) shows the functional overlap in the same
subgraph. Although the overlapping vertex pairs are rather
few, YKR010c has a positive functional symmetry coefficient
(rather weak, however, with a p-value of around five percent).
The main reason for this is that similar vertices are
very rare due to the quite strict definition of similarity (Eq. (6)).

FIG. 2 Example from the S. cerevisiae protein interaction network illustrating the symmetries of YKR010c. The concentric ellipses mark the
first and second neighborhoods. (a) illustrates the configuration giving the degree-symmetry coefficient 0.809. (b) illustrates the functional symmetries
resulting in a functional symmetry coefficient of 0.299. The vertices connected by a shaded area have identical sets of functions.

FIG. 3 The two-neighborhood of YGL250w in the S. cerevisiae protein interaction network. The symbols are the same as in Fig. 2. (a)
shows the degree symmetry situation giving the degree-symmetry coefficient -0.178. (b) shows the functional overlaps in the two-neighborhood of
YGL250w giving a functional symmetry coefficient of 0.965.

Fig. 3(a) shows a protein, YGL250w, with a negative degree-symmetry
coefficient. The visual impression of skewness of
YGL250w's two-neighborhood is, we believe, another aspect
of this degree-asymmetry. In contrast, the functional symmetry
coefficient of YGL250w is large. As noted above,
due to the many possible sets of functions (675 in total), functionally
overlapping pairs are quite rare; yet in this example
there are seven sets of overlapping pairs, or triplets, at the same
distance from YGL250w, which explains the high functional
symmetry.

IV. DISCUSSION AND CONCLUSIONS 

In this paper we have proposed a general framework for 
measuring symmetries of the surrounding of a vertex. The 
basic idea is that observational processes often take the form 
of walks; in other words, that the symmetry means that the 
network looks the same along many paths leading out from a 
vertex. This leads us to the principle that if the set of paths of
a limited length l out from a vertex i is invariant to permutations,
then i is a local center of symmetry. We exemplify this
framework, and the derived symmetry coefficient, by studying
the protein interaction network of S. cerevisiae. For this
network, databases catalog traits of the vertices, which allows
two fundamentally different symmetries to be measured: the 
degree symmetry (where the similarity is related to the net- 
work structure) and functional symmetry (where the similar- 
ity stems from external information). These two coefficients 
are exemplified by two proteins in very different symmetry 
configurations (one with high degree symmetry and weakly 
positive functional symmetry, another with degree asymmetry 
and very high functional symmetry). We do not attempt to deduce
the biological meaning of the symmetry coefficients. But
we can conceive that symmetry and biological function are related
from the presence of "network-motifs" (13) in biological
networks. Network motifs are small, statistically overrepresented
subgraphs with, presumably, specific functions. If one
vertex controls, or is controlled by, several such motifs, then
it would have a high (degree, functional or other) symmetry coefficient.
To conclude, we believe symmetries can be a useful
concept for analyzing complex networks. There are, further- 
more, many ways to extend this work to other measures and 
applications. 

Acknowledgments 

PH acknowledges financial support from the Wenner-Gren
foundations, the National Science Foundation (grant CCR-0331580),
and the Santa Fe Institute.



References 

[1] R. Albert and A.-L. Barabási. Statistical mechanics of complex
networks. Rev. Mod. Phys., 74:47-98, 2002.

[2] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U.
Hwang. Complex networks: Structure and dynamics. Physics
Reports, 424:175-308, 2006.

[3] E. H. Davidson. The Regulatory Genome: Gene Regulatory
Networks In Development And Evolution. Academic Press,
Burlington MA, 2006.

[4] S. N. Dorogovtsev and J. F. F. Mendes. Evolution of Networks:
From Biological Nets to the Internet and WWW. Oxford University
Press, Oxford, 2003.

[5] P. Holme. Detecting degree symmetries in networks. Phys. Rev.
E, 74, 2006. e-print physics/0605029.

[6] P. Holme and M. Huss. Role-similarity based functional prediction
in networked systems: application to the yeast proteome.
J. Roy. Soc. Interface, 2:327-333, 2005.

[7] J. Janin and S. J. Wodak, editors. Protein Modules and Protein-Protein
Interactions, volume 61 of Advances in Protein Chemistry.
Academic Press, Amsterdam, 2002.

[8] H. Jeong, S. P. Mason, A.-L. Barabási, and Z. N. Oltvai. Lethality
and centrality in protein networks. Nature, 411:41-42, 2001.

[9] M. E. J. Newman. The structure and function of complex networks.
SIAM Review, 45:167-256, 2003.

[10] P. Pagel, S. Kovac, M. Oesterheld, B. Brauner, I. Dunger-Kaltenbach,
G. Frishman, C. Montrone, P. Mark, V. Stümpflen,
H. W. Mewes, A. Ruepp, and D. Frishman. The MIPS mammalian
protein-protein interaction database. Bioinformatics,
21:832-834, 2004.

[11] J. M. Roberts Jr. Simple methods for simulating sociomatrices
with given marginal totals. Social Networks, 22:273-283, 2000.

[12] A. Ruepp, A. Zollner, D. Maier, K. Albermann, J. Hani,
M. Mokrejs, I. Tetko, U. Güldener, G. Mannhaupt,
M. Münsterkötter, and H. W. Mewes. The FunCat, a functional
annotation scheme for systematic classification of proteins from
whole genomes. Nucleic Acids Res., 32:5539-5545, 2004.

[13] S. Shen-Orr, R. Milo, S. Mangan, and U. Alon. Network motifs
in the transcriptional regulation network of Escherichia coli.
Nature Genetics, 31:64-68, 2002.

[14] A. Wagner. Robustness and Evolvability in Living Systems.
Princeton University Press, Princeton NJ, 2005.

[15] S. Yook, Z. Oltvai, and A.-L. Barabási. Functional and topological
characterization of protein interaction networks. Proteomics,
4:928-942, 2004.

[16] J. Zhao, H. Yu, J. Luo, Z. W. Cao, and Y.-X. Li. Complex networks
theory for analyzing metabolic networks. Chinese Science
Bulletin, 51:1529, 2006.



